Cocktail Party Processing
Abstract
Speech segregation, or the cocktail party problem, has proven to be extremely challenging. This presentation describes a computational auditory scene analysis (CASA) approach to the cocktail party problem. This approach performs auditory segmentation and grouping in a two-dimensional time-frequency representation that encodes proximity in frequency and time, periodicity, amplitude modulation, and onset/offset. In segmentation, our model decomposes the input mixture into contiguous time-frequency segments. Grouping is first performed for voiced speech where detected pitch contours are used to group voiced segments into a target stream and the background. In grouping voiced speech, resolved and unresolved harmonics are dealt with differently. Grouping of unvoiced segments is based on supervised classification of acoustic-phonetic features. This CASA approach has led to major advances towards solving the cocktail party problem.
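The pitch-based grouping step described above can be illustrated with a rough sketch. This is not the authors' implementation: it substitutes a plain STFT for an auditory filterbank, assumes the target pitch contour is already known rather than detected, and covers only the simplest case of grouping time-frequency units that lie near a harmonic of the pitch. The function names, the `tol_hz` tolerance, and all parameter values are hypothetical choices made for illustration.

```python
import numpy as np

def stft_mag(x, frame_len=512, hop=256):
    """Magnitude STFT with a Hann window; returns an array of shape (frames, bins).

    Assumption: an STFT stands in for the auditory time-frequency representation.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack([x[i * hop : i * hop + frame_len] * window
                       for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames, axis=1))

def harmonic_mask(f0_per_frame, n_bins, sample_rate, frame_len, tol_hz=40.0):
    """Binary mask that is True where a bin lies within tol_hz of a pitch harmonic.

    f0_per_frame holds the (assumed known) target pitch in Hz for each frame;
    a value <= 0 marks an unvoiced frame, which is left ungrouped here.
    """
    bin_freqs = np.arange(n_bins) * sample_rate / frame_len
    mask = np.zeros((len(f0_per_frame), n_bins), dtype=bool)
    for t, f0 in enumerate(f0_per_frame):
        if f0 <= 0:
            continue
        nearest = np.round(bin_freqs / f0) * f0  # closest harmonic to each bin
        mask[t] = (np.abs(bin_freqs - nearest) <= tol_hz) & (nearest > 0)
    return mask

# Toy mixture: a 200 Hz harmonic "target" plus a 330 Hz interfering tone.
sr = 8000
t = np.arange(sr) / sr
target = sum(np.sin(2 * np.pi * 200 * k * t) for k in (1, 2, 3))
mixture = target + np.sin(2 * np.pi * 330 * t)

spec = stft_mag(mixture)
mask = harmonic_mask(np.full(spec.shape[0], 200.0), spec.shape[1], sr, 512)
target_stream = spec * mask      # units grouped with the target pitch
background = spec * ~mask        # everything else
```

In this toy case the bins around 200, 400, and 600 Hz fall into the target stream, while the bin carrying the 330 Hz tone is assigned to the background. Unvoiced frames are left unassigned, mirroring the abstract's point that unvoiced segments need a separate grouping mechanism.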
Related resources
A Biologically Motivated Solution to the Cocktail Party Problem
We present a new approach to the cocktail party problem that uses a cortronic artificial neural network architecture (Hecht-Nielsen, 1998) as the front end of a speech processing system. Our approach is novel in three important respects. First, our method assumes and exploits detailed knowledge of the signals we wish to attend to in the cocktail party environment. Second, our goal is to provide...
Deep Transform: Cocktail Party Source Separation via Probabilistic Re-Synthesis
In cocktail party listening scenarios, the human brain is able to separate competing speech signals. However, the signal processing implemented by the brain to perform cocktail party listening is not well understood. Here, we trained two separate convolutive autoencoder deep neural networks (DNN) to separate monaural and binaural mixtures of two concurrent speech streams. We then used these DNN...
L'amorçage sémantique masqué en situation de cocktail party (Masked semantic priming in cocktail party situation) [in French]
The present study aimed at testing automatic semantic processing in the auditory modality using the cocktail party situation. Participants had to perform a lexical decision task on a target item embedded in a multi-talker babble. This babbl...
Improved Cocktail-Party Processing
The human auditory system is able to focus on one speech signal and ignore other speech signals in an auditory scene where several conversations are taking place. This ability of the human auditory system is referred to as the “cocktail-party effect”. This property of human hearing is partly made possible by binaural listening. Interaural time differences (ITDs) and interaural level differences...
Binaural Scene Analysis and Automatic Speech Recognition
The human auditory system is known to be able to easily analyze and decompose complex acoustic scenes into their constituent acoustic sources. This requires the integration of a multitude of acoustic cues, a phenomenon that is often referred to as cocktail-party processing. Auditory Scene Analysis, especially the segregation and comprehension of concurrent speakers, is one of the key features in ...
Cocktail Party Processing via Structured Prediction
While human listeners excel at selectively attending to a conversation in a cocktail party, machine performance is still far inferior by comparison. We show that the cocktail party problem, or the speech separation problem, can be effectively approached via structured prediction. To account for temporal dynamics in speech, we employ conditional random fields (CRFs) to classify speech dominance ...